Feat/v2 e2e tests#3

Open
ev-shindin wants to merge 5 commits into main from feat/v2-e2e-tests

@ev-shindin
Owner

test

Add e2e test jobs for the V2 saturation engine that mirror the existing
V1 tests. V2 smoke tests run automatically on every PR alongside V1,
while V2 full tests are triggered by ChatOps comments with a -v2 suffix
(/trigger-e2e-full-v2, /test-e2e-full-v2, /test-full-v2).

Changes:
- CI: add check-full-tests-v2, e2e-tests-smoke-v2, e2e-tests-full-v2 jobs
- CI: update report-status to emit separate V1/V2 status contexts
- Makefile: pass ANALYZER_NAME to deploy-e2e-infra
- Helm: conditionally render analyzerName in ConfigMap template
- Helm: add analyzerName field to values.yaml and values-dev.yaml
- install.sh: accept ANALYZER_NAME env var for helm --set

parseSaturationConfig() called Validate() without first calling
ApplyDefaults(). As a result, V2 configs with analyzerName: saturation
failed validation: scaleUpThreshold and scaleDownBoundary are omitempty
and default to zero, and Validate() rejects zero values.

The engine then skipped every model with "Saturation scaling
config not loaded yet for namespace" and made no scaling decisions.
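
The ordering problem described above can be illustrated with a minimal Go sketch. It is not the project's code: the SaturationConfig struct, the default values, and the prepareConfig helper are hypothetical stand-ins, and only the names ApplyDefaults, Validate, scaleUpThreshold, scaleDownBoundary, and analyzerName come from this PR. It assumes the intended behavior is to fill defaults before validating.

```go
package main

import (
	"errors"
	"fmt"
)

// SaturationConfig is a hypothetical, trimmed-down stand-in for the real type.
// With omitempty, fields left out of the config decode to zero.
type SaturationConfig struct {
	AnalyzerName      string  `yaml:"analyzerName,omitempty"`
	ScaleUpThreshold  float64 `yaml:"scaleUpThreshold,omitempty"`
	ScaleDownBoundary float64 `yaml:"scaleDownBoundary,omitempty"`
}

// Assumed default values, chosen only for illustration.
const (
	defaultScaleUpThreshold  = 0.85
	defaultScaleDownBoundary = 0.70
)

// ApplyDefaults fills in any threshold that was left at its zero value.
func (c *SaturationConfig) ApplyDefaults() {
	if c.ScaleUpThreshold == 0 {
		c.ScaleUpThreshold = defaultScaleUpThreshold
	}
	if c.ScaleDownBoundary == 0 {
		c.ScaleDownBoundary = defaultScaleDownBoundary
	}
}

// Validate rejects zero (or negative) thresholds, as described in the PR.
func (c *SaturationConfig) Validate() error {
	if c.ScaleUpThreshold <= 0 {
		return errors.New("scaleUpThreshold must be greater than zero")
	}
	if c.ScaleDownBoundary <= 0 {
		return errors.New("scaleDownBoundary must be greater than zero")
	}
	return nil
}

// prepareConfig (hypothetical helper) applies defaults first, then validates.
func prepareConfig(c SaturationConfig) (SaturationConfig, error) {
	c.ApplyDefaults()
	if err := c.Validate(); err != nil {
		return SaturationConfig{}, fmt.Errorf("invalid saturation config: %w", err)
	}
	return c, nil
}

func main() {
	raw := SaturationConfig{AnalyzerName: "saturation"}
	fmt.Println("validate without defaults fails:", raw.Validate() != nil)

	cfg, err := prepareConfig(raw)
	fmt.Println("after defaults:", cfg.ScaleUpThreshold, cfg.ScaleDownBoundary, err)
}
```

Calling Validate() on the raw config reproduces the failure mode; routing through a single helper that always applies defaults first keeps the two steps from drifting apart again.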

When the model server does not emit the vllm:cache_config_info metric
(e.g., llm-d-inference-sim), TotalKvCapacityTokens is 0. The V2
analyzer previously skipped such replicas entirely, which left
totalDemand=0 and produced no scale-up decisions.

Add computeReplicaCapacityFallback that uses the deployment-derived
capacity from the capacity store and estimates demand from KvCacheUsage
percentage. This allows V2 to produce scaling decisions with any
vLLM-compatible server, not just those emitting cache_config_info.
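
A rough Go sketch of the fallback idea follows. It is an illustration under stated assumptions, not the implementation of computeReplicaCapacityFallback: the ReplicaMetrics and ReplicaCapacity types, the fallbackCapacity function, and its signature are hypothetical, the capacity store lookup is reduced to a plain integer argument, and KvCacheUsage is assumed to be a fraction in [0, 1] rather than a 0-100 percentage.

```go
package main

import "fmt"

// ReplicaMetrics is a hypothetical subset of the per-replica metrics.
type ReplicaMetrics struct {
	TotalKvCapacityTokens int64   // 0 when vllm:cache_config_info is not emitted
	KvCacheUsage          float64 // assumed to be a fraction in [0, 1]
}

// ReplicaCapacity is a hypothetical result type: capacity and estimated demand.
type ReplicaCapacity struct {
	CapacityTokens float64
	DemandTokens   float64
}

// fallbackCapacity estimates demand as usage * capacity, where capacity is the
// deployment-derived value that would come from the capacity store.
// It reports false when no usable capacity is known.
func fallbackCapacity(m ReplicaMetrics, storeCapacityTokens int64) (ReplicaCapacity, bool) {
	if storeCapacityTokens <= 0 {
		return ReplicaCapacity{}, false
	}
	// Clamp usage so a noisy metric cannot produce negative or inflated demand.
	usage := m.KvCacheUsage
	if usage < 0 {
		usage = 0
	}
	if usage > 1 {
		usage = 1
	}
	capTokens := float64(storeCapacityTokens)
	return ReplicaCapacity{CapacityTokens: capTokens, DemandTokens: usage * capTokens}, true
}

func main() {
	m := ReplicaMetrics{TotalKvCapacityTokens: 0, KvCacheUsage: 0.5}
	// Only fall back when the metric-derived capacity is missing.
	if m.TotalKvCapacityTokens == 0 {
		if rc, ok := fallbackCapacity(m, 1000); ok {
			fmt.Printf("capacity=%.0f demand=%.0f\n", rc.CapacityTokens, rc.DemandTokens)
		}
	}
}
```

Returning a boolean alongside the result lets the caller keep skipping a replica only when neither the metric nor the capacity store provides a capacity.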

Add /ok-to-test-v2 and /retest-v2 ChatOps commands that deploy with
analyzerName: saturation to test the V2 capacity-constraint analyzer
on real GPU hardware. Also add an analyzer_name workflow_dispatch input.

@github-actions

This PR is marked as stale after 21d of inactivity. After an additional 14d of inactivity (7d to become rotten, then 7d more), it will be closed. To prevent this PR from being closed, add a comment or remove the lifecycle/stale label.
